Unsupervised Domain Tuning to Improve Word Sense Disambiguation

Authors

  • Judita Preiss
  • Mark Stevenson
Abstract

The topic of a document can be useful information for Word Sense Disambiguation (WSD), since certain meanings tend to be associated with particular topics. This paper presents an approach to WSD based on Latent Dirichlet Allocation (LDA), which is trained using any available WSD system to establish a sense per LDA topic. The technique is tested using three unsupervised WSD algorithms and one supervised WSD algorithm within the SPORT and FINANCE domains. Performance increases in every case, suggesting that the technique may be useful for improving the performance of any available WSD system.
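
The "sense per topic" idea can be illustrated with a minimal sketch. The abstract does not say how the sense for each topic is chosen, so the majority vote below is an assumption, not the authors' method. The topic ids are assumed to come from a separately trained LDA model, and the function names (`learn_sense_per_topic`, `retag`), the sense labels, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_sense_per_topic(instances):
    """Build a (word, topic) -> sense table by majority vote.

    `instances` is an iterable of (word, topic_id, sense) triples, where
    `sense` is the label proposed by some base WSD system and `topic_id`
    is the (assumed, precomputed) LDA topic of the containing document.
    """
    counts = defaultdict(Counter)
    for word, topic, sense in instances:
        counts[(word, topic)][sense] += 1
    # Keep the most frequent base-system sense for each (word, topic) pair.
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def retag(instances, sense_per_topic):
    """Replace each base-system label with the sense learned for its topic.

    Falls back to the base-system label for unseen (word, topic) pairs.
    """
    return [
        (word, topic, sense_per_topic.get((word, topic), sense))
        for word, topic, sense in instances
    ]


# Hypothetical base WSD output: topic 0 is finance-like, topic 1 is not.
base_output = [
    ("bank", 0, "bank%finance"),
    ("bank", 0, "bank%finance"),
    ("bank", 0, "bank%river"),   # likely a base-system error within topic 0
    ("bank", 1, "bank%river"),
]

table = learn_sense_per_topic(base_output)
retagged = retag(base_output, table)
```

In this toy run the third instance is relabelled with the majority sense of its topic, which is the kind of correction a sense-per-topic table could provide on top of any base system.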

Similar resources

An Improved Approach for Word Ambiguity Removal

Word ambiguity removal is the task of removing ambiguity from a word, i.e. identifying the correct sense of a word in an ambiguous sentence. This paper describes a model that uses a part-of-speech tagger and three categories for word sense disambiguation (WSD). Human-computer interaction is needed to improve interactions between users and computers. For this, the Supervised and Unsupervised metho...

Integrating Domain and Paradigmatic Similarity for Unsupervised Sense Tagging

An unsupervised methodology for Word Sense Disambiguation, called Dynamic Domain Sense Tagging, is presented. It relies on the convergence of two very well known unsupervised approaches (i.e. Domain Driven Disambiguation and Conceptual Density). For each target word a domain is dynamically modeled by expanding its topical context, i.e. a set of words evoking the underlying/implicit domain wh...

Domain Specific Sense Disambiguation with Unsupervised Methods

Most approaches in sense disambiguation have been restricted to supervised training over manually annotated, non-technical, English corpora. Application to a new language or technical domain requires extensive manual annotation of appropriate training corpora. As this is both expensive and inefficient, unsupervised methods are to be preferred, specifically in technical domains such as medicine....

Unsupervised Domain Relevance Estimation for Word Sense Disambiguation

This paper presents Domain Relevance Estimation (DRE), a fully unsupervised text categorization technique based on the statistical estimation of the relevance of a text with respect to a certain category. We use a pre-defined set of categories (we call them domains) which have been previously associated to WORDNET word senses. Given a certain domain, DRE distinguishes between relevant and non-r...

HIT-CIR: An Unsupervised WSD System Based on Domain Most Frequent Sense Estimation

This paper presents an unsupervised system for the all-words domain-specific word sense disambiguation task. The system tags each target word with its most frequent sense, which is estimated using a thesaurus and the word distribution information in the domain. The thesaurus is automatically constructed from a bilingual parallel corpus using a paraphrase technique. The recall of this system is 43.5% on SemEv...

Publication date: 2013